Abstract

Many real networks can be understood as two complementary networks with two kinds of nodes.
This is the case for metabolic networks, where the first network has chemical compounds as nodes
and the second has reactions as nodes. The second network can be related to the first by
a technique called the line graph transformation (i.e., edges in an initial network are transformed into
nodes). Recently, the main topological properties of metabolic networks have been properly
described by means of a hierarchical model. In our work, we apply the line graph transformation
to a hierarchical network and calculate the clustering coefficient $C(k)$ for the transformed
network, where $k$ is the node degree. While $C(k)$ follows the scaling law $C(k) \sim k^{-1.1}$ for the
initial hierarchical network, $C(k)$ scales weakly as $k^{0.08}$ for the transformed network. These results
indicate that the reaction network can be identified as a degree-independent clustering network.

PACS numbers: 89.75.-k, 05.65.+b

I. INTRODUCTION



Recent studies in network science demonstrate that cellular networks are described by
universal features that are also present in non-biological complex systems, such as
social networks or the WWW [1]. Most networks encountered in the real world have a scale-free
topology, in particular networks of fundamental elements of cells such as proteins and chemical
compounds. In these networks, the distribution of node degree follows a power law,
$P(k) \sim k^{-\gamma}$, where $P(k)$ is the frequency of nodes that are connected to $k$ other nodes. The degree
of a node is the number of other nodes to which it is connected.

One of the most successful models for explaining this scale-free topology was proposed
by Barabási and Albert P, |^, who introduced a mean-field method to simulate the growth
dynamics of individual nodes in a continuum theory framework. However, although that
model was a milestone in understanding the behavior of real complex networks, it could not
reproduce all the features observed in real networks, such as the degree dependence of the
clustering. In order to bring all the observed properties of biological networks under a single
framework, Ravasz et al. successfully proposed a hierarchical and modular topology ~\. For
networks with $N$ nodes, these observed properties are: a scale-free degree distribution
$P(k) \sim k^{-\gamma}$, power-law scaling of the clustering coefficient $C(k)$, and a high average
clustering coefficient $C(N)$ that is independent of network size. A network with these
properties is called a hierarchical network.

In the hierarchical model [7] (the RSMOB model in what follows), the network is simultaneously
scale-free and has a high clustering coefficient that is independent of the network
size. The key signature of hierarchical modularity is the dependence of the clustering coefficient
on the node degree $k$, which follows $C(k) \sim k^{-1}$. This result means
that nodes with only a few links have a high clustering coefficient and are the centers of numerous
interlinked modules. Highly connected nodes (hubs), on the other hand, have a smaller
clustering coefficient, and their task is to connect different modules. In [2,y], it is shown
that many real networks (biological and non-biological) have a hierarchical organization.
One of them, which is the subject of our study, is the metabolic network.

It is also interesting to note that the metabolic network is an example of a bipartite network
[10]. In a bipartite network there are two kinds of nodes, and edges only connect nodes of
different kinds. In the metabolic network these nodes are chemical compounds and reactions.
The network generated by the chemical compounds (reactions) is called the compound (reaction)
projection. A line graph transformation (i.e., each edge between two compounds becomes
a node (reaction) of the transformed network) relates both projections. A detailed analysis
of the line graph transformation, focused on the degree distribution $P(k)$ and applied to the
metabolic network, can be found in [11]. In that work, similarities and differences between the
line graph transformation and the metabolic network are discussed. It was found that
if the initial network follows a power law $P(k) \sim k^{-\gamma}$, the transformed network preserves the
scale-free topology, and in most cases the exponent is increased by one unit, $P(k) \sim k^{-\gamma+1}$.

The observed topological properties related to the clustering of the metabolic
network (in particular, the chemical compound network) have been properly described by
means of the RSMOB model. In the present work, our aim is to study the clustering coefficients
$C(k)$ and $C(N)$ of the reaction network by using two approaches. First, we derive
mathematical equations for those coefficients in the transformed network. Second, we apply
the line graph transformation to a hierarchical network. The results from both methods
are compared with experimental data on reactions from the KEGG database [12], showing
good agreement. Although we started this work motivated by theoretical interest in the line
graph transformation, the results provide an explanation for the difference in $C(k)$ between the
compound network and the reaction network.

In our work, the hierarchical network is generated by the RSMOB model, where the
nodes correspond to chemical compounds and the edges correspond to reactions. While the
RSMOB model successfully reproduces the hierarchical properties of the compound network,
here we show that this hierarchical model also stores adequate information to reproduce the
experimental data of the reaction network. Our study indicates that it is enough to apply
the line graph transformation to the hierarchical network to extract that information. While
$C(k)$ follows the power law $k^{-1.1}$ for the initial hierarchical network (compound network),
$C(k)$ scales weakly as $k^{0.08}$ for the transformed network (reaction network). Consequently,
we conclude that the reaction network cannot be defined as a hierarchical network.

It is also worth noting that the line graph transformation has recently been applied
with success to the protein interaction network with the aim of detecting functional modules.
In that work, the edges (interactions) between two proteins become the nodes of the
transformed network (interaction network). By means of the line graph transformation,
the interaction network acquires a more pronounced structure than the protein
network (i.e., a higher clustering coefficient). By using the TribeMCL algorithm [ij], they
are able to detect clusters in the more highly clustered interaction network. These clusters
are transformed back to the initial protein-protein network to identify which proteins
form functional clusters. We note that the aim of our study is not to detect
functional modules in the metabolic network. In our work the line graph transformation
is used to evoke topological properties related to the clustering of the
reaction network.

The paper is organised as follows. In Sec. II, we describe the theoretical concepts of
our approach and explain the mathematical methods used in our analysis. In Sec. III
we present the experimental data from metabolic pathways of the KEGG database for $C(k)$
and $C(N)$, and we compare them with our theoretical predictions before and after the line graph
transformation. The final section summarizes our work.

II. THEORETICAL APPROACH 

A. Clustering coefficients $C(k)$ and $C(N)$

Recent analyses have demonstrated that the metabolic network has a hierarchical organization,
with the following properties: a scale-free degree distribution $P(k) \sim k^{-\gamma}$, a power-law dependence
of the clustering coefficient $C(k) \sim k^{-1}$, and an average clustering coefficient $C(N)$ that is
independent of network size, where $N$ is the total number of nodes in the network [7]. The
clustering coefficient can be defined for each node $i$ as

$$C_i(k_i) = \frac{2 n_i}{k_i (k_i - 1)}, \tag{1}$$

where $n_i$ denotes the number of edges connecting the $k_i$ nearest neighbors of node $i$ to each
other. $C_i$ is equal to 1 for a node at the center of a fully interlinked cluster, and it is 0 for
a node that is part of a loosely connected cluster [7]. An example can be seen in Fig.
1(a). Geometrically, $n_i$ gives the number of triangles that go through node $i$. The factor
$k_i(k_i - 1)/2$ gives the total number of triangles that could go through node $i$ (i.e., the total
number of triangles obtained when all the neighbors of node $i$ are connected to each other).
In the case of Fig. 1(a), there is one triangle that contains node 1 (dash-dotted lines), and
a total of 6 triangles could be generated as the maximum. Hence, the clustering coefficient
of node 1 is $C_1 = 1/6$.

FIG. 1: (a) Example of clustering in an undirected network. Continuous and dash-dotted lines denote
interactions between nodes. In addition, the dash-dotted lines define the only triangle in which
node 1 (red) is one of the vertices. Node 1 has 4 neighbors ($k_1 = 4$), and among these neighbors
only one pair is connected ($n_1 = 1$). The total number of possible triangles that could go through
node 1 is 6. Thus, the clustering coefficient has the value $C_1 = 1/6$. A high density of triangles means
a high clustering coefficient. (b) Example of the line graph transformation. The initial
graph $G$ corresponds to a subgraph of the Lysine Biosynthesis metabolic pathway.
This graph is constructed by taking chemical compounds as nodes and reactions as edges. By
applying the line graph transformation we find the graph $L(G)$, which is the reaction graph embedded
in the graph $G$. The nodes of the graph $L(G)$ are the reactions of the graph $G$.

On the other hand, the average clustering coefficient $C(N)$ characterizes the overall tendency
of nodes to form clusters as a function of the total size of the network $N$. The
mathematical expression is

$$C(N) = \frac{1}{N} \sum_{i=1}^{N} C_i(k_i). \tag{2}$$

The structure of the network is given by the function $C(k)$, which is defined as the average
clustering coefficient over nodes with the same degree $k$:

$$C(k) = \frac{1}{N_k} \sum_{i:\, k_i = k} C_i(k_i), \tag{3}$$

where $N_k$ is the number of nodes with degree $k$, and the sum runs over the $N_k$ nodes with
degree $k$. A scaling law $k^{-1}$ for this quantity is an indication of the hierarchical topology
of a network.

Once the theoretical definitions have been introduced, our aim is to analyse how the
coefficients $C(N)$ and $C(k)$ are modified under the line graph transformation.
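The three definitions above can be illustrated with a short computation. The following is only a minimal sketch in Python, not the code used for this work: the function names are hypothetical, the toy graph merely mimics the situation described for Fig. 1(a) (node 1 with four neighbors, one connected pair), and assigning $C_i = 0$ to nodes with fewer than two neighbors is an assumed convention.

```python
from collections import defaultdict
from itertools import combinations


def clustering_coefficients(edges):
    """Return ({node: C_i}, adjacency) for an undirected graph given as edge pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    coeffs = {}
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            coeffs[node] = 0.0  # assumed convention: no triangle is possible
            continue
        # n_i: number of edges among the k neighbors of the node
        n_links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        coeffs[node] = 2.0 * n_links / (k * (k - 1))
    return coeffs, neighbors


def average_clustering(coeffs):
    """C(N): mean of C_i over all nodes."""
    return sum(coeffs.values()) / len(coeffs)


def clustering_by_degree(coeffs, neighbors):
    """C(k): mean of C_i over the N_k nodes that have degree k."""
    by_degree = defaultdict(list)
    for node, c in coeffs.items():
        by_degree[len(neighbors[node])].append(c)
    return {k: sum(cs) / len(cs) for k, cs in by_degree.items()}


# Toy graph: node 1 has neighbors 2, 3, 4, 5 and only the pair (2, 3) is linked.
edges = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3)]
coeffs, neighbors = clustering_coefficients(edges)
print(coeffs[1])                                # clustering coefficient of node 1
print(average_clustering(coeffs))               # C(N)
print(clustering_by_degree(coeffs, neighbors))  # C(k)
```

For node 1 the sketch gives $2 \cdot 1/(4 \cdot 3) = 1/6$, the value quoted for Fig. 1(a).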



B. Line graph transformation on metabolic networks: spurious nodes 

Given an undirected graph $G$, defined by a set of nodes $V(G)$ and a set of edges $E(G)$,
we associate another graph $L(G)$, called the line graph of $G$, in which $V(L(G)) = E(G)$ and
two vertices are adjacent if and only if the corresponding edges have a common endpoint in $G$
(i.e., $E(L(G)) = \{\{(u,v),(v,w)\} \mid (u,v) \in E(G),\ (v,w) \in E(G)\}$). This construction of the
graph $L(G)$ from the initial graph $G$ is called the line graph transformation [18].
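As an illustration of this definition, the following minimal Python sketch builds $L(G)$ from an edge list. The function name `line_graph` is hypothetical, the graph is assumed to be simple and undirected, and the pairwise comparison of edges is chosen for clarity rather than efficiency.

```python
from itertools import combinations


def line_graph(edges):
    """Return (nodes, edges) of L(G) for a simple undirected graph G.

    Each node of L(G) is an edge of G (stored as a sorted tuple); two such
    nodes are linked when the corresponding edges of G share an endpoint.
    """
    nodes = sorted({tuple(sorted(e)) for e in edges})
    lg_edges = set()
    for e, f in combinations(nodes, 2):
        if set(e) & set(f):  # common endpoint in G
            lg_edges.add((e, f))
    return nodes, lg_edges


# A star with hub 0 and three leaves becomes a triangle in L(G).
star_nodes, star_links = line_graph([(0, 1), (0, 2), (0, 3)])
print(len(star_nodes), len(star_links))

# A path 1-2-3-4 becomes a path with three nodes and two links.
path_nodes, path_links = line_graph([(1, 2), (2, 3), (3, 4)])
print(len(path_nodes), len(path_links))
```

The star example shows why hubs of $G$ generate densely connected regions of $L(G)$: all edges meeting at a node of degree $k$ become a fully connected cluster of $k$ nodes.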



It is worth noting that in a previous work [11] the degree distribution $P(k)$ was studied by
applying the line graph transformation to synthetic and real networks. There, an
initial graph $G$ with scale-free topology $P(k) \sim k^{-\gamma}$ is assumed. As the degree of each transformed
node (i.e., an edge in $G$) will be roughly around $k$, and each node of degree $k$ contributes $k$ edges,
the degree distribution of the line graph $L(G)$ should be $k \cdot k^{-\gamma} = k^{-\gamma+1}$ for degree around $k$.
Therefore, it is concluded that if a graph $G$ has a degree distribution following a power law $k^{-\gamma}$,
then $L(G)$ will follow a power law $k^{-\gamma+1}$. The real networks under study were protein-protein interaction,
WWW, and metabolic networks. In Fig. 1(b), we can see an example of the line graph
transformation applied to a subgraph of the metabolic network.

However, it is important to point out one issue. In metabolic networks, there are cases
where spurious nodes appear. For example, consider two reactions sharing the same
substrate (or product), where at least one of the reactions has more than one product
(or substrate). If we apply a line graph transformation to this network, we obtain
more than two nodes in the transformed network, where only two nodes (reflecting the two
reactions present in the real process) should appear. These spurious nodes appear only
when one or more reactions in the network have more than one product (or substrate).

Therefore, these cases should be identified and transformed by generating only as many
nodes in the transformed network as there are reactions in the real metabolic process. This procedure
is called the physical line graph transformation. In the present work, we have applied this
procedure to generate the reaction network by using experimental data from the KEGG
database. The experimental data [12] are shown later in Figs. 6 and 7 (blue diamonds). More
detailed information about this issue can be found in [11].
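The spurious-node issue can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the data layout (`reactions` as a dictionary of substrate and product lists), the function name `reaction_network`, and the rule that two reactions are linked when they share at least one compound are all hypothetical choices made here, not the procedure used to process the KEGG data.

```python
from itertools import combinations


def reaction_network(reactions):
    """Build a reaction graph with exactly one node per reaction.

    reactions: {name: (substrates, products)}.
    Two reactions are linked when they share a compound (assumed rule).
    """
    compounds = {
        name: set(subs) | set(prods) for name, (subs, prods) in reactions.items()
    }
    links = set()
    for r1, r2 in combinations(sorted(reactions), 2):
        if compounds[r1] & compounds[r2]:
            links.add((r1, r2))
    return links


# Hypothetical toy process: R1: A -> B + C and R2: A -> D.
# In the compound graph R1 contributes the edges A-B and A-C, so a plain
# line graph would contain three nodes (A-B, A-C, A-D) for two reactions.
reactions = {"R1": (["A"], ["B", "C"]), "R2": (["A"], ["D"])}
links = reaction_network(reactions)
print(sorted(reactions), links)
```

Working directly from the list of reactions keeps the number of transformed nodes equal to the number of reactions, which is the requirement stated above.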



C. Equations for $C(k)$ and $C(N)$ under the line graph transformation

We assume a graph $G$ as depicted in Fig. 2(a). In this graph, edge $a$ connects two
nodes with degrees $k'$ and $k''$. We apply the line graph transformation to $G$, and
the result is the line graph $L(G)$ shown in Fig. 2(b). Under the line graph transformation,
the nodes of $L(G)$ are the edges of $G$, with two
nodes of $L(G)$ adjacent whenever the corresponding edges of $G$ share an endpoint.

FIG. 2: (a) Graph $G$ with two hubs of degrees $k'$ and $k''$ connected by edge $a$. (b) The
corresponding line graph $L(G)$ after the line graph transformation. (c) Graph $G$ where
edges $b$ and $b'$ have a common node as endpoint. (d) Line graph of (c). Note that
(d) has only one more edge than (b). Hence, (d) has one more triangle going through node $a$
than (b).

The clustering coefficient for node $a$ in the transformed network can be written by
using Eq. (1) as

$$C_a(k) = \frac{2\left[(k'-1)(k'-2)/2 + (k''-1)(k''-2)/2\right]}{(k'-1+k''-1)(k'-1+k''-1-1)}, \tag{4}$$

where $k = k' + k'' - 2$, because the edge $a$ vanishes in the graph $L(G)$. This equation ignores
cases where edges in the graph $G$, $b$ and $b'$ for example, have a common node as endpoint
(i.e., the existence of triangles or loops, as in Fig. 2(c)). However, we can quantify these cases by
using a new parameter $l$. As we can see in Fig. 2(c)-(d), two edges with one common
endpoint in the graph $G$ produce one additional edge in the graph $L(G)$. This additional edge
in $L(G)$ connects two neighbors of node $a$. Following the definition in Eq. (1), this means that
$n_a$ increases by one unit. We account for these cases by increasing the
parameter $l$ by one unit for each common endpoint of two edges in the graph $G$ (for example,
$l = 1$ means one common node). After introducing the parameter $l$, Eq. (4) becomes [17]:

$$C_a(k) = \frac{2\left[(k'-1)(k'-2)/2 + (k''-1)(k''-2)/2 + l\right]}{(k'-1+k''-1)(k'-1+k''-1-1)}, \tag{5}$$

where $l = 0$ means that there are no loops and we recover Eq. (4). It should be noted
that $l$ always contributes to increasing the value of $C_a(k)$, and that $C_a(k) \le 1$ always holds by
definition. In order to study the limits of Eq. (5) we consider the following two cases:

• a) $k' = k''$: We analyse the case where both degrees have the same value. We also
consider the cases $l = 0$ and $l \neq 0$ in order to study the effect of triangles. The
results are shown in Fig. 3. For large $k'$, Eq. (5) goes asymptotically to 1/2 for both $l = 0$
and $l \neq 0$. We also see that for $k' > 25$, all lines are very close to 1/2. For low $k'$ and
$l = 0$, $C_a(k)$ takes values from 0.33 ($k' = 3$) to 0.48 ($k' = 20$). We also see in Fig. 3
that higher values of $l$ (more triangles) increase the values of $C_a(k)$.

• b) $k''$ = constant, $k' \gg k''$: In Fig. 4 we plot three cases, in which $k''$ is fixed at
$k'' = 5$ (black), $k'' = 10$ (red), and $k'' = 20$ (blue) and $k'$ is a free parameter.
We see that $C_a(k)$ approaches 1 when $k'$ takes large values. For low $k'$, the case
$k'' = 5$ shows a minimum, with a few values of $k'$ giving $C_a(k)$ below 1/2. As shown by the dotted
and dash-dotted lines in Fig. 4, the presence of triangles ($l \neq 0$) increases the value
of $C_a(k)$. Finally, for $k'' = 10$ and $k'' = 20$, only a few values of $C_a(k)$
are slightly below 1/2 for low $k'$. This analysis is complemented by calculating the
minimum value of $C_a(k)$ analytically from $\partial C_a(k)/\partial k' = 0$. The value of $k'$ at which
$C_a(k)$ takes its minimum value is given by

$$k' = \frac{-1 + l + k'' + \sqrt{2 + 3l + l^2 - 7k'' - 5lk'' + 9k''^2 + 2lk''^2 - 5k''^3 + k''^4}}{-1 + k''}. \tag{6}$$

By substituting this expression into Eq. (5), it is possible to calculate the minimum
value of $C_a(k)$ for each configuration of $l$ and $k''$.

From these two cases, we conclude that for hubs (i.e., nodes with high degree,
$k'$ and $k'' \gg 1$) and for highly clustered networks (many triangles, $l \gg 1$), the values of
$C_a(k)$ in the transformed network lie approximately in the interval $[1/2, 1]$.
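The limits discussed in cases a) and b) can be checked numerically. The sketch below assumes that Eq. (5) has the form $C_a = [(k'-1)(k'-2) + (k''-1)(k''-2) + 2l]/[(k'+k''-2)(k'+k''-3)]$ and that the location of the minimum is $k' = [-1 + l + k'' + \sqrt{D}]/(k''-1)$ with $D = 2 + 3l + l^2 - 7k'' - 5lk'' + 9k''^2 + 2lk''^2 - 5k''^3 + k''^4$; the function names `c_a` and `k1_at_minimum` are hypothetical.

```python
import math


def c_a(k1, k2, l=0):
    """Clustering coefficient of the transformed node a (assumed form of Eq. (5)).

    k1, k2: degrees k' and k'' of the two endpoints of edge a in G.
    l: number of extra links between neighbors of a created by triangles in G.
    """
    num = (k1 - 1) * (k1 - 2) + (k2 - 1) * (k2 - 2) + 2 * l
    den = (k1 + k2 - 2) * (k1 + k2 - 3)
    return num / den


def k1_at_minimum(k2, l=0):
    """Value of k' that minimizes C_a for fixed k'' > 1 (assumed form of Eq. (6))."""
    disc = (2 + 3 * l + l ** 2 - 7 * k2 - 5 * l * k2 + 9 * k2 ** 2
            + 2 * l * k2 ** 2 - 5 * k2 ** 3 + k2 ** 4)
    return (-1 + l + k2 + math.sqrt(disc)) / (k2 - 1)


# Case a): k' = k'' with l = 0.
print(c_a(3, 3), c_a(20, 20), c_a(10 ** 6, 10 ** 6))

# Case b): k'' fixed, k' large, and the location of the minimum for k'' = 5.
print(c_a(10 ** 6, 5))
k_min = k1_at_minimum(5)
print(k_min, c_a(k_min, 5))
```

With these assumptions, the equal-degree case starts at 1/3 for $k' = 3$ and tends to 1/2, while the case of fixed $k''$ tends to 1 for large $k'$, consistent with the interval quoted above.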




FIG. 3: Values of $C_a(k)$ from Eq. (5) calculated by taking $k' = k''$. The number of common nodes as
endpoints of two edges (triangles) is indicated by the parameter $l$. The degree of transformed
nodes is $k = k' + k'' - 2$ because the edge $a$ vanishes in the graph $L(G)$.

FIG. 4: Values of $C_a(k)$ from Eq. (5) calculated by taking $k''$ with constant values $k'' = 5$ (black
line), $k'' = 10$ (red), $k'' = 20$ (blue) and $k'$ as a free parameter. Dotted and dash-dotted lines show
the presence of triangles ($l \neq 0$). Triangles increase the value of $C_a(k)$.

To calculate the distribution of $C(k)$ in the transformed space, $C^{T}(k)$, we introduce the
concept of assortativity. By assortative (disassortative) mixing in networks we understand
the preference of nodes with high degree to connect to other high (low) degree nodes [19].
Following Newman [19], we define $e_{k'k''}$ as the probability that a randomly chosen
edge has nodes of degrees $k'$ and $k''$ at its two ends. We also assume that
the nodes of the initial network follow a power-law degree distribution $k^{-\gamma}$ and have no
assortative mixing. Under these assumptions, the probability distribution $e_{k'k''}$ of edges
that link together nodes with degrees $k'$ and $k''$ can be written as

$$e_{k'k''} = \frac{k'^{-\gamma+1}\, k''^{-\gamma+1}}{\sum_{k'} k'^{-\gamma+1} \sum_{k''} k''^{-\gamma+1}}. \tag{7}$$

We make a convolution between Eq. (4) and Eq. (7) by summing over all the possible
degrees $(k', k'')$ of the two nodes at either end of an edge that can generate transformed
nodes with degree $k = k' + k'' - 2$. Thus, we obtain

$$C^{T}(k) = \frac{\sum_{k'+k''-2=k} k'^{-\gamma+1}\, k''^{-\gamma+1}\, C_a(k)}{\sum_{k'+k''-2=k} k'^{-\gamma+1}\, k''^{-\gamma+1}}. \tag{8}$$

According to the structure of $C^{T}(k)$ and the behavior of $C_a(k)$ described above, $C^{T}(k)$ will
grow smoothly for large $k$, i.e., it scales weakly with the node degree $k$. We have calculated
this expression numerically, and the results are discussed later in Fig. 6.
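A numerical evaluation of this convolution can be sketched as follows. The code is an illustration only: it assumes that each edge joining degrees $k'$ and $k''$ is weighted by $(k' k'')^{-\gamma+1}$ (the uncorrelated case), that Eq. (4) has the closed form used in `c_a`, and that degrees run from a hypothetical lower cutoff `k_min`; the function names are not from the original work.

```python
def c_a(k1, k2, l=0):
    """Clustering coefficient of a transformed node (assumed form of Eqs. (4)-(5))."""
    num = (k1 - 1) * (k1 - 2) + (k2 - 1) * (k2 - 2) + 2 * l
    den = (k1 + k2 - 2) * (k1 + k2 - 3)
    return num / den


def c_transformed(k, gamma, k_min=2, l=0):
    """Weighted average of C_a over all (k', k'') with k' + k'' - 2 = k.

    Weights (k' * k'')**(1 - gamma) assume no assortative mixing.
    Requires k >= 2 * k_min - 2 so that at least one pair exists.
    """
    num = 0.0
    den = 0.0
    for k1 in range(k_min, k + 2 - k_min + 1):
        k2 = k + 2 - k1
        weight = (k1 * k2) ** (1.0 - gamma)
        num += weight * c_a(k1, k2, l)
        den += weight
    return num / den


# Example: gamma = 2.26 is the value quoted for the network with 4 initial nodes.
for k in (4, 10, 50, 200):
    print(k, c_transformed(k, gamma=2.26))
```

Because the result is a weighted average of $C_a$ values, it always lies between the smallest and largest $C_a$ compatible with the given $k$.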

We have also calculated the analytical expression for $C(N)$, and we have found that $C(N)$
is size-independent both before and after the line graph transformation. We
can write the number of nodes with degree $k$ as

$$N_k \simeq \frac{N \cdot k^{-\gamma}}{\sum_{k} k^{-\gamma}}, \tag{9}$$

and we assume that $C(k) = A \cdot k^{-\alpha}$, where $A$ is a constant. This constant changes when we
consider hierarchical networks with different numbers of nodes in the initial cluster JZi]. This
seems natural, because in that case the degree distribution $P(k) \sim k^{-\gamma}$ of the network also
changes. For $C(N)$ before the transformation we can write [20]:

$$C(N) = \frac{1}{N}\sum_{i=1}^{N} C_i(k_i) = \sum_{k=2} \frac{N_k}{N}\, C(k) = A\, \frac{\sum_{k=2} k^{-\gamma-\alpha}}{\sum_{k=1} k^{-\gamma}}. \tag{10}$$
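The last form of Eq. (10) is straightforward to evaluate. The sketch below assumes the expression $C(N) = A \sum_{k \ge 2} k^{-\gamma-\alpha} / \sum_{k \ge 1} k^{-\gamma}$ and truncates both sums at a hypothetical cutoff `k_max`; the parameter triples are the values of $\gamma$, $\alpha$ and $A$ listed in Table I.

```python
def average_clustering_eq10(gamma, alpha, A, k_max=100_000):
    """Evaluate A * sum_{k>=2} k^(-gamma-alpha) / sum_{k>=1} k^(-gamma).

    k_max is an assumed truncation of the sums; both series converge
    for the exponents considered here.
    """
    num = sum(k ** (-gamma - alpha) for k in range(2, k_max + 1))
    den = sum(k ** (-gamma) for k in range(1, k_max + 1))
    return A * num / den


# (gamma, alpha, A) for 3, 4 and 5 initial nodes, as listed in Table I.
for m, (gamma, alpha, A) in {3: (2.58, 1.1, 2.34),
                             4: (2.26, 1.1, 3.68),
                             5: (2.16, 1.1, 5.18)}.items():
    print(m, round(average_clustering_eq10(gamma, alpha, A), 3))
```

Under these assumptions the three results fall close to the values of $C(N)$ reported in Table I (0.20, 0.36 and 0.54).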

Furthermore, if we use the RSMOB model (explained in the next section), Eq. (10) takes the
form

$$C(N) \approx A'\, \frac{\sum_{j=1}^{\log_m N} \left[(m-1)^j\right]^{-\gamma'-\alpha}}{\sum_{j=1}^{\log_m N} \left[(m-1)^j\right]^{-\gamma'}}, \tag{11}$$

where $j$ is the number of iterations, $m$ is the number of nodes in the initial fully connected
cluster, and $A'$ is a constant adjusted so that $C(N) = 1$ holds for $j = 1$. The upper limit of
the summation, $\log_m N$, is obtained from the expression $m^j = N$, which gives the total
number of nodes in the network, and $\gamma' = \ln m / \ln(m-1)$ denotes the exponent of the power-law
degree distribution of hubs in the RSMOB model [2]. The approximately-equal symbol indicates
that Eq. (11) is valid for hub nodes; non-hub nodes are not considered.

By using these equations, we will see later (Tables I and II) that $C(N)$ converges to a
constant. In order to calculate $C(N)$ after the line graph transformation is applied, $C^{T}(N)$,
we make the substitution $C(k) \to C^{T}(k)$ in Eq. (10). As we have seen from Eq. (8) that
$C^{T}(k)$ is almost constant, we can conclude that $C^{T}(N)$ is also almost constant and
nearly independent of network size.

While the scaling law $C(k) \sim k^{-1}$ was proved mathematically in Ql, here we have
obtained the analytical expressions for $C^{T}(k)$, $C(N)$ and $C^{T}(N)$.

III. RESULTS AND DISCUSSION 

The RSMOB model [7] is able to reproduce the main topological features of the metabolic
network. We follow the method described in [3] and generate a hierarchical network. Then,
we apply the line graph transformation to that network.

Fig. 5 illustrates the hierarchical network generated by the RSMOB model. The network
is made of densely linked 5-node modules (the number of nodes in
the initial module can be different from 5) that are assembled into larger 25-node modules
(iteration $n = 1$, $5^2 = 25$ nodes). In the next step four replicas are created and the peripheral
nodes are connected again to produce 125-node modules (iteration $n = 2$, $5^3 = 125$ nodes).
This process can be repeated indefinitely [7, 8].
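The replication procedure can be sketched in a few lines of Python. This is only an illustration of the standard deterministic construction, under the assumption that at each iteration the peripheral nodes of every replica are linked to the central hub of the original module; it does not include the additional links between the central hubs of the replicas mentioned in the caption of Fig. 5, and the function name `hierarchical_network` is hypothetical.

```python
from itertools import combinations


def hierarchical_network(m=5, iterations=2):
    """Return (number_of_nodes, edge_set) of a deterministic hierarchical network.

    Node 0 is the central hub. At each iteration the current network is
    copied m - 1 times and the peripheral nodes of the copies are linked
    to node 0 (assumed wiring rule).
    """
    n = m
    edges = set(combinations(range(m), 2))  # fully connected initial module
    peripheral = list(range(1, m))          # every node except the hub
    for _ in range(iterations):
        new_edges = set(edges)
        new_peripheral = []
        for replica in range(1, m):
            offset = replica * n
            # copy the internal links of the current network
            new_edges.update((u + offset, v + offset) for u, v in edges)
            # link the replica's peripheral nodes to the central hub
            for p in peripheral:
                new_edges.add((0, p + offset))
                new_peripheral.append(p + offset)
        edges, peripheral, n = new_edges, new_peripheral, n * m
    return n, edges


n_nodes, edges = hierarchical_network(m=5, iterations=2)
hub_degree = sum(1 for u, v in edges if 0 in (u, v))
print(n_nodes, len(edges), hub_degree)
```

The node counts reproduce the sizes quoted in the text (5, 25 and 125 nodes for $n = 0, 1, 2$ with 5-node modules); the edge set can be passed to any line graph routine to obtain the transformed network.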



FIG. 5: Hierarchical network generated by using the RSMOB model [7]: (a) $n = 0$, $N = 5$; (b) $n = 1$,
$N = 25$; (c) $n = 2$, $N = 125$. Starting from a fully
connected cluster of 5 nodes, 4 identical replicas are created, obtaining a network of $N = 25$ nodes
in the first iteration $n = 1$ ($5^2 = 25$ nodes). The model differs slightly from that of [7] because
we have linked the central hubs of the replicas to each other. This process can be repeated indefinitely.
We note that the initial number of nodes can be different from 5.

To evaluate $C(k)$, we have constructed three hierarchical networks with 3, 4, and 5 initial
nodes. These networks were generated up to 7 (6561), 5 (4096), and 4 (3125)
iterations (nodes), respectively. Once these three networks have been constructed, we apply
the line graph transformation to them and calculate the clustering coefficient $C^{T}(k)$ of
the transformed networks. Fig. 6(a) shows the results for the clustering coefficient of
the transformed network. Circles, triangles and squares indicate the values of $C(k)$ for the
transformed network with 3, 4, and 5 initial nodes, respectively. In Fig. 6(a) we also plot
with continuous lines the values of $C^{T}(k)$ obtained from Eq. (8). From top to bottom, the
lines correspond to the networks with 3, 4 and 5 initial nodes, respectively. In Fig. 6(a),
the lines show an acceptable agreement with the overall tendency of the data generated by
the transformed network. In Fig. 6(b), the results from the theoretical calculation of
$C^{T}(k)$ via Eq. (8) (lines) are in good agreement with the experimental data (diamonds) from
the KEGG database [12]. The only disagreement occurs at $k = 2$. This is easy to understand,
because in the hierarchical model depicted in Fig. 5 we can only find $C(k = 2) = 1$ for 3
initial nodes, by construction of the network. However, in real networks we could find nodes
that have only two neighbors and, in some cases, these neighbors could be connected. In
these cases the clustering coefficient takes the value one.

In Fig. 7, we show the results for $C(k)$ after the line graph transformation is applied to
the hierarchical network generated with 4 initial nodes and up to 5 iterations. The results are
shown with empty triangles (red) and fitted to the dashed line. We see that $C(k) \sim k^{-1.1}$
changes into $C^{T}(k) \sim k^{0.08}$. We also see that the line graph transformation increases the
average clustering of the transformed network. These theoretical results were
compared with the experimental data from KEGG [12], finding good agreement and
supporting the result of a degree-independent clustering coefficient $C^{T}(k)$ for the reaction
network.

For $C(N)$ we have evaluated Eq. (10) for 3 different configurations: 3, 4 and 5 initial
nodes with up to 7, 5 and 4 iterations, respectively. As
explained in [7, 8], $C(N)$ approaches a constant value asymptotically, being independent
of the size of the network. The asymptotic value depends on the initial number of nodes.
We calculated the values of $\gamma$ corresponding to the degree distribution $P(k) \sim k^{-\gamma}$ for each
network, and the related constant $A$, which appears in Eq. (10). Table I shows the
values of these parameters and the results for $C(N)$ obtained from Eq. (10). These values, as
can be seen in Fig. 8(a), are below the asymptotic values of ~0.66 (circles) and ~0.74
(triangles) obtained by using the RSMOB model. However, we have found an explanation
for this result. In Fig. 7, the full circles at the top of the dash-dotted line correspond to
non-hub nodes. We have checked that these nodes do not follow a power law; hence the
value of $C(k)$ is being overestimated by the scaling dependence $k^{-\alpha}$ and it provides a larger
value of $C(N)$. In [7], the values of $C(N)$ from the hierarchical model were compared with the
experimental values for 43 organisms. The values of $C(N)$ for each organism were around
0.15 - 0.25. By using the KEGG database we have evaluated the experimental value of $C(N)$
for 163 organisms and obtained an average value of 0.08.

FIG. 6: (a) Results of the hierarchical model for $C^{T}(k)$ for different configurations: 3
initial nodes and up to 7 iterations (circles), 4 initial nodes and up to 5 iterations (triangles), 5
initial nodes and up to 4 iterations (squares). From top to bottom (3 initial nodes (black), 4 initial
nodes (red), 5 initial nodes (green)), the lines show the results for $C^{T}(k)$ obtained by means
of Eq. (8). (b) The lines have the same meaning as before and the diamonds correspond to the
experimental data for reactions from the KEGG database [12]. The experimental data involve 163
organisms.

FIG. 7: Full circles (red) and dot-dashed line (red): $C(k)$ evaluated with the hierarchical network.
Empty triangles (red) and dashed line (red): $C(k)$ after the line graph transformation is applied
to the hierarchical network ($C^{T}(k)$). Diamonds (blue): $C(k)$ of reaction data from the KEGG
database [12]. Empty circles (yellow) and continuous line: $C(k)$ of compound data from KEGG.
Hierarchical model with 4 initial nodes and 5 iterations.

Fig. 8(a) shows the values of $C(N)$ calculated for networks generated with 3 initial
nodes (circles) and 4 initial nodes (triangles) by using the RSMOB model. We see that $C(N)$
approaches constant values of around ~0.66 (circles) and ~0.74 (triangles) asymptotically,
being independent of the size of the network. Once the line graph transformation is applied,
the corresponding values of $C^{T}(N)$ also approach constant values asymptotically.
Hence, $C^{T}(N)$ is also size-independent for large $N$ (empty circles and triangles). In
addition, we have averaged the experimental value of the clustering coefficient for reactions
of 163 organisms found in the KEGG database and obtained the value $C^{T}(N) = 0.74$.



m initial (total) nodes    $\gamma$    $\alpha$    $A$     $C(N)$ (Eq. (10))
3 (6561)                   2.58        1.1         2.34    0.20
4 (4096)                   2.26        1.1         3.68    0.36
5 (3125)                   2.16        1.1         5.18    0.54

TABLE I: Results for $C(N)$ evaluated by using Eq. (10), and the parameters needed in that
calculation, for 3 different setups: $\gamma = 1 + \gamma'$, where $\gamma' = \ln m / \ln(m-1)$ ($P(k) \sim k^{-\gamma}$), $\alpha$ ($C(k) \sim k^{-\alpha}$), and $A$
($C(k) = A \cdot k^{-\alpha}$). Eq. (10) is a general expression for $C(N)$.



m initial (total) nodes    $\gamma'$    $\alpha$    $C(N)$ (Eq. (11))
3 (6561)                   1.58         1.1         0.78
4 (4096)                   1.26         1.1         0.81
5 (3125)                   1.16         1.1         0.83

TABLE II: Results for $C(N)$ evaluated by using Eq. (11) for 3 different setups. The exponent of
the power-law distribution of hubs is given by $\gamma' = \ln m / \ln(m-1)$. The parameter $\alpha$ has the same meaning as
in Table I. We also note that in Eq. (11), $A'$ is adjusted so that $C(N) = 1$ holds for $j = 1$. Eq.
(11) is the particular expression for $C(N)$ applied to the RSMOB model.

We see that the experimental value of $C^{T}(N)$ for reactions is in good agreement with the
asymptotic values obtained from the transformed network (empty triangles and circles).

Furthermore, we have also calculated $C(N)$ by using Eq. (11). This equation should
reproduce the results for $C(N)$ calculated by using the RSMOB model (dark circles and
triangles in Fig. 8(a)). In Fig. 8(b), we see that the results are qualitatively similar to those
shown in Fig. 8(a) (dark circles and triangles).

We remark that the theoretical analysis of $C(N)$ and $C^{T}(N)$ done here has also been
useful to prove that they are independent of network size.

Finally, in Fig. 9 we plot the hierarchical network (left) and the transformed network
(right) by using the graph drawing tool Pajek [22]. We see the high degree of compactness
of the transformed network. This could be related to the concept of robustness of a network:
if one node is removed at random from the reaction network depicted in
Fig. 9, the normal behavior of the cell might be preserved by finding an alternative path
(reaction) to complete the task. This could be a consequence of the high degree of
clustering and connectivity between the nodes in the transformed network.

FIG. 8: (a) Dark (black): $C(N)$ calculated by using the hierarchical network. Light (green):
$C^{T}(N)$ ($C(N)$ after the line graph transformation is applied to the hierarchical network). Circles: 3
initial nodes. Triangles: 4 initial nodes. Star (red): experimental $C^{T}(N)$ for reactions from the
KEGG database. (b) $C(N)$ calculated by using Eq. (11). The results show good agreement
with, and a similar tendency to, those shown in Fig. 8(a) (dark circles and triangles).

At the end of this section, it is convenient to summarize our findings in Table III. The
† and ⋆ symbols in Table III mark the functions studied analytically and evaluated
by us. We see that central properties of networks were studied by using the line graph
transformation technique, which suggests the effectiveness of the method.



FIG. 9: Left: Hierarchical network generated by using the model of ref. [7] with 4-node modules and
up to 2 iterations. Right: Network after the line graph transformation. We see a huge interlinked
cluster in the center of the figure, which generates the degree-independent clustering coefficient $C(k)$
(it scales weakly as $C(k) \sim k^{0.08}$).



Func.       Definition                        Dependence before       Dependence after
$P(k)$      $N_k/N$                           $k^{-\gamma}$           $k^{-\gamma+1}$ ⋆
$C_i(k)$    $2 n_i/[k_i(k_i - 1)]$            $k^{-1.1}$              $k^{0.08}$ †
$C(N)$      $\frac{1}{N}\sum_i C_i(k_i)$      size-independent †      size-independent †

TABLE III: Definitions of functions and their dependences before and after the line graph transformation
is applied to the hierarchical network. $N_k$: number of nodes of degree $k$. The † symbol means
that these dependences were analyzed in the present work, while the ⋆ symbol means that it was
studied in our previous work [11].

IV. CONCLUSIONS 



We have studied here the clustering coefficients $C(k)$ and $C(N)$ of the reaction network
by applying the line graph transformation to a hierarchical network. This hierarchical network
was generated by using the RSMOB model, which properly reproduces the topological
features of the metabolic network, in particular the compound network. Our results indicate
that by applying the line graph transformation to the hierarchical network, it is possible to
extract topological properties of the reaction network, which is embedded in the metabolic
network. The RSMOB model stores adequate information about the reaction network, and
the line graph transformation is a useful technique to evoke it.

While $C(k)$ scales as $k^{-1.1}$ for the initial hierarchical network (compound network), we find
$C(k) \sim k^{0.08}$ for the transformed network (reaction network). This theoretical prediction was
compared with the experimental data from the KEGG database, finding good agreement.
Our results indicate that the reaction network is a degree-independent clustering network.
Furthermore, the weak scaling of $C(k)$ for the reaction network suggests that this network
does not have a hierarchical organization.

On the other hand, we have also carried out an analytical derivation of the clustering
coefficients $C(k)$ and $C(N)$. Expressions for these coefficients were calculated before and
after the line graph transformation is applied to the hierarchical network. The agreement
obtained by using these expressions was found to be acceptable, and consequently they could be
useful for further analyses.

The line graph transformation has recently been applied to metabolic networks to
study the scale-free topology of the reaction network, and to the protein-protein interaction
network to detect functional clusters. The work done here is another
application of this technique.